refactor(pgvector): use async DocumentStore mixin tests#3094
refactor(pgvector): use async DocumentStore mixin tests#3094SyedShahmeerAli12 wants to merge 10 commits intodeepset-ai:mainfrom
Conversation
Closes deepset-ai#3052 Replaces duplicate async test implementations with the mixin classes from haystack.testing.document_store and haystack.testing.document_store_async. - Removes 19 tests now covered by the mixins - Keeps all Qdrant-specific tests (hybrid search, sparse config, collection setup validation, vector preservation)
Closes deepset-ai#3050 Replaces duplicate async test implementations with mixin classes from haystack.testing.document_store and haystack.testing.document_store_async. - Removes 17 tests now covered by the mixins - Keeps all pgvector-specific tests (blob write, connection recreation, invalid connection, empty meta validation, hnsw index, table management)
|
@SyedShahmeerAli12 thanks for the contribution. It seems though this PR is touching qdrant files even though it should only be for pgvector right? |
This PR only covers pgvector. The qdrant async test refactor was accidentally included - reverting to keep scope correct.
get_metadata_field_min_max and its async counterpart now return
{"min": None, "max": None} when the field is not present in the
store (e.g. empty collection) instead of raising ValueError.
Fixes test_get_metadata_field_min_max_empty_collection_async.
| For numeric fields (integer, real), returns numeric min/max. | ||
| For text fields, returns lexicographic min/max based on database collation. | ||
| :raises ValueError: If the field doesn't exist or has no values. | ||
| Returns ``{"min": None, "max": None}`` when the field has no values or the store is empty. |
There was a problem hiding this comment.
We shouldn't use double back ticks in our api docstrings only single backticks
| Returns ``{"min": None, "max": None}`` when the field has no values or the store is empty. | |
| Returns `{"min": None, "max": None}` when the field has no values or the store is empty. |
…ty store
The hardcoded 'content' entry was always included in fields_info even
when the store had no documents. The mixin contract (and common sense)
requires {} for an empty store. Removed the hardcoded initialisation
and updated the local test to match.
pgvector stores embeddings as float32, so round-tripped embeddings lose precision and exact Document equality fails. The same override already exists in test_filters.py for the sync FilterDocumentsTest — applying the same pattern to TestDocumentStoreAsync.
Two fixes: 1. Move 'import pytest' from inside assert_documents_are_equal to top of file (PLC0415 lint rule). 2. Override test_count_not_empty_async: the haystack mixin defines this method without '@staticmethod' or 'self', so pytest passes the class instance as 'document_store'. Adding a proper 'self' override makes fixture injection work correctly.
…viour The mixin raises NotImplementedError by design — each store must declare its own default duplicate policy. pgvector raises DuplicateDocumentError on a second write without an explicit policy, matching the sync test_write_documents override.
CountDocumentsByFilterAsyncTest, CountUniqueMetadataByFilterAsyncTest, and UpdateByFilterAsyncTest were added to haystack.testing.document_store in 2.27.0. The lowest-direct-deps CI run was resolving to 2.26.1 and failing with ImportError.
Coverage report (pgvector)Click to see where and how coverage changed
This report was generated by python-coverage-comment-action |
||||||||||||||||||||||||
|
All tests passing now. Sorry for the initial noise the qdrant file was accidentally included and has been reverted. A few fixes were needed along the way: get_metadata_field_min_max and get_metadata_fields_info now return empty/None results for an empty store instead of raising; added assert_documents_are_equal override to handle float32 precision loss; overrode two mixin methods that had fixture injection issues; and bumped haystack-ai>=2.27.0 since the async test mixin classes were introduced there. |
Related Issues
Closes #3050
What this changes
Refactors
integrations/pgvector/tests/test_document_store_async.pyto inherit from the async mixin classes introduced in deepset-ai/haystack#10799.